© Scott Robison 2021 all rights reserved.
For Chapter 11: Linear Models and Estimation by Least Squares, we will cover the same material as the textbook, though likely not in the same order. I will list the page range for the whole of Chapter 11, but I will go at my own pace and in my own order, so please reference the textbook at your convenience.
Chapter 11 pages 563-609 from the text.
Since the least-squares estimate \(\widehat{Y}_i=\widehat{\beta}_0+\widehat{\beta}_1 X_i\) is built from sample data through the point estimates \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\), it follows that interval estimation and hypothesis testing can be applied to give context to the quality of our estimates.
\[\widehat{Y}_i=\widehat{\beta}_0+\widehat{\beta}_1 X_i\]
Inference on \(\widehat{\beta}_i\)’s:
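The tests and intervals below all rest on the same standard result, developed in the text: under the usual normal-error assumptions, each estimator standardized by its estimated standard error follows a \(t\) distribution,

\[T=\frac{\widehat{\beta}_i-\beta_{i,0}}{\widehat{\text{SE}}\left(\widehat{\beta}_i\right)}\sim t_{n-2}\quad\text{under }H_0:\beta_i=\beta_{i,0},\qquad \widehat{\text{SE}}\left(\widehat{\beta}_1\right)=\frac{S}{\sqrt{S_{xx}}},\quad S^2=\frac{\text{SSE}}{n-2}.\]

The "Std. Error" column of R's coefficient table reports exactly these estimated standard errors.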
Let’s look back at our student height and weight data. Use height as the predictor and weight as the response.
X <- c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75)
Y <- c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
Student <- 1:10
reg1 <- data.frame("Student ID" = Student, height = X, weight = Y)
cor(reg1$height,reg1$weight)
## [1] 0.9470984
fit <- lm(weight ~ height, data = reg1)
summary(fit)
##
## Call:
## lm(formula = weight ~ height, data = reg1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.2339 -4.0804 -0.0963 4.6445 14.2158
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -266.5344 51.0320 -5.223 8e-04 ***
## height 6.1376 0.7353 8.347 3.21e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.641 on 8 degrees of freedom
## Multiple R-squared: 0.897, Adjusted R-squared: 0.8841
## F-statistic: 69.67 on 1 and 8 DF, p-value: 3.214e-05
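As a quick sanity check on the summary output, a short standalone script (Python here, just for illustration; it is not part of the course R code) reproduces the coefficients from the closed-form least-squares formulas \(\widehat{\beta}_1=S_{xy}/S_{xx}\) and \(\widehat{\beta}_0=\bar{Y}-\widehat{\beta}_1\bar{X}\):

```python
# Hand computation of the least-squares line for the height/weight data,
# cross-checking R's lm() output (slope 6.1376, intercept -266.5344).
X = [63, 64, 66, 69, 69, 71, 71, 72, 73, 75]
Y = [127, 121, 142, 157, 162, 156, 169, 165, 181, 208]
n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n
Sxx = sum((x - xbar) ** 2 for x in X)                      # sum of squares of x
Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))   # cross sum of squares
b1 = Sxy / Sxx           # slope estimate
b0 = ybar - b1 * xbar    # intercept estimate
print(round(b1, 4), round(b0, 4))  # 6.1376 -266.5344
```

The same arithmetic works by hand: \(S_{xx}=138.1\), \(S_{xy}=847.6\), so \(\widehat{\beta}_1=847.6/138.1\approx 6.1376\).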
confint(fit,level = .95)
## 2.5 % 97.5 %
## (Intercept) -384.214357 -148.854434
## height 4.441894 7.833269
sfit <- summary(fit)
# a) Is the y-intercept different from 0?  H0: beta0 = 0 vs. Ha: beta0 != 0
sfit$coefficients[1, 1] / sfit$coefficients[1, 2]  # t statistic
## [1] -5.222889
2 * pt(-abs(sfit$coefficients[1, 1] / sfit$coefficients[1, 2]), df = 10 - 2)  # two-sided p-value
## [1] 0.0007997666
confint(fit,level = .95)
## 2.5 % 97.5 %
## (Intercept) -384.214357 -148.854434
## height 4.441894 7.833269
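By hand, part (a) is the two-sided test of \(H_0:\beta_0=0\) vs. \(H_a:\beta_0\ne 0\), using the Coefficients table from the summary:

\[t=\frac{-266.5344-0}{51.0320}\approx -5.223,\qquad p\text{-value}=2P(T_8\le -5.223)\approx 0.0008,\]

so we reject \(H_0\) at any common \(\alpha\): the intercept differs significantly from 0.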
# b) Is the slope positive?  H0: beta1 = 0 vs. Ha: beta1 > 0
sfit$coefficients[2, 1] / sfit$coefficients[2, 2]  # t statistic
## [1] 8.346638
1 - pt(sfit$coefficients[2, 1] / sfit$coefficients[2, 2], df = 10 - 2)  # upper-tail p-value
## [1] 1.606882e-05
confint(fit,level = .95)
## 2.5 % 97.5 %
## (Intercept) -384.214357 -148.854434
## height 4.441894 7.833269
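Part (b) is one-sided, so only the upper tail is used:

\[t=\frac{6.1376-0}{0.7353}\approx 8.347,\qquad p\text{-value}=P(T_8>8.347)\approx 1.6\times 10^{-5},\]

which is half of the two-sided \(Pr(>|t|)\) value \(3.21\times 10^{-5}\) printed by `summary(fit)`. We reject \(H_0\): the slope is positive.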
# c) Is the slope greater than 5?  H0: beta1 = 5 vs. Ha: beta1 > 5
(sfit$coefficients[2, 1] - 5) / sfit$coefficients[2, 2]  # t statistic
## [1] 1.547023
1 - pt((sfit$coefficients[2, 1] - 5) / sfit$coefficients[2, 2], df = 10 - 2)  # upper-tail p-value
## [1] 0.08022232
confint(fit,level = .95)
## 2.5 % 97.5 %
## (Intercept) -384.214357 -148.854434
## height 4.441894 7.833269
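Part (c) shifts the null value to 5 before standardizing:

\[t=\frac{6.1376-5}{0.7353}\approx 1.547,\qquad p\text{-value}=P(T_8>1.547)\approx 0.0802,\]

so at \(\alpha=.05\) we fail to reject \(H_0:\beta_1=5\). Notice that 5 falls inside the 95% confidence interval \((4.44,\,7.83)\) for the slope.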
How strong is the linear relationship between the height of a student and his or her grade point average? Data were collected on a random sample of \(n = 35\) students in a statistics course at Penn State University.
height <- c(66, 57, 64.5, 62, 69.5, 65, 63, 68, 59.5, 64, 69, 70, 66, 73, 69, 61, 68, 72, 72, 73.5, 63, 67, 64, 67, 70, 68, 61.5, 67, 71.5, 71, 63, 60.5, 74, 72, 67)
gpa <- c(2.9, 3.16, 3.62, 2, 3.45, 2.8, 3.63, 2.81, 3.33, 2.75, 3.86, 1.5, 2.49, 3.1, 2.7, 1.94, 2.13, 2.84, 2.85, 3.33, 3, 3.23, 3.59, 3.74, 2.6, 3.39, 3.49, 2.79, 3.2, 3.2, 3.3, 2.7, 2.5, 2.89, 3.2)
reg2 <- data.frame(height, gpa)
reg2
fit2 <- lm(gpa ~ height, data = reg2)
cor(reg2$height,reg2$gpa)
## [1] -0.05324126
summary(fit2)
##
## Call:
## lm(formula = gpa ~ height, data = reg2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.45081 -0.24878 0.00325 0.35622 0.90263
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.410214 1.434616 2.377 0.0234 *
## height -0.006563 0.021428 -0.306 0.7613
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5423 on 33 degrees of freedom
## Multiple R-squared: 0.002835, Adjusted R-squared: -0.02738
## F-statistic: 0.09381 on 1 and 33 DF, p-value: 0.7613
plot(fit2)
confint(fit2,level = .95)
## 2.5 % 97.5 %
## (Intercept) 0.49146481 6.32896273
## height -0.05015819 0.03703227
predict(fit2,data.frame(height=70))
## 1
## 2.950807
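Putting this output together: the slope's p-value (0.7613) and a 95% confidence interval for the slope that contains 0 both indicate essentially no linear relationship between height and GPA, consistent with \(r=-0.053\). The point prediction at 70 inches is simply the fitted line evaluated there:

\[\widehat{y}=3.410214-0.006563(70)\approx 2.9508.\]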
Consider the following null and alternative hypotheses:
\[\begin{align} &H_0: \beta_1=\beta_2=\beta_3=...=0\\ &H_a:\text{at least one }\beta_i \ne 0\text{, where }i=1,2,...\\ \end{align}\]
In SLR there is only one slope, \(\beta_1\), so the hypotheses reduce to
\[\begin{align} H_0:\beta_1=0\\ H_a:\beta_1\ne 0\\ \end{align}\]
Instead of testing this with a t-test, one can use the decomposition of variability in the response variable, encountered when computing the coefficient of determination and the standard deviation of the regression, to derive a test of linear appropriateness. Recall the variance decomposition used in computing the coefficient of determination:

\[\underbrace{\sum_{i=1}^{n}\left(Y_i-\bar{Y}\right)^2}_{\text{Total SS}}=\underbrace{\sum_{i=1}^{n}\left(\widehat{Y}_i-\bar{Y}\right)^2}_{\text{SSR}}+\underbrace{\sum_{i=1}^{n}\left(Y_i-\widehat{Y}_i\right)^2}_{\text{SSE}}\]

Under \(H_0:\beta_1=0\),

\[F=\frac{\text{MSR}}{\text{MSE}}=\frac{\text{SSR}/1}{\text{SSE}/(n-2)}\sim F_{1,\,n-2},\]

and large values of \(F\) support \(H_a\).
Verify that the F test of linear appropriateness for the student height and weight data is equivalent to the t-test of \(H_0:\beta_1=0\).
X <- c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75)
Y <- c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
reg1 <- data.frame(height = X, weight = Y)
fit <- lm(weight ~ height, data = reg1)
anova(fit)
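In SLR the overall F test and the two-sided t-test of \(H_0:\beta_1=0\) are the same test, because the F statistic is the square of the t statistic:

\[F=t^2:\qquad (8.3466)^2\approx 69.67,\]

which is exactly the F-statistic reported by `summary(fit)`, with the identical p-value \(3.214\times 10^{-5}\).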
Proof given in text. Continuing our example:
X <- c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75)
Y <- c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
reg1 <- data.frame(height = X, weight = Y)
fit <- lm(weight ~ height, data = reg1)
confint(fit,level = .95)
## 2.5 % 97.5 %
## (Intercept) -384.214357 -148.854434
## height 4.441894 7.833269
predict(fit,data.frame(height=70))
## 1
## 163.0963
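This point estimate is just the fitted line evaluated at \(x^*=70\):

\[\widehat{y}=-266.5344+6.1376(70)\approx 163.10,\]

agreeing with R's 163.0963 up to rounding of the coefficients.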
predict(fit,data.frame(height=70),se.fit=T,interval="confidence",level=.95)
## $fit
## fit lwr upr
## 1 163.0963 156.684 169.5086
##
## $se.fit
## [1] 2.780697
##
## $df
## [1] 8
##
## $residual.scale
## [1] 8.641368
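The confidence limits come from the standard interval for the mean response \(E(Y\mid x^*)\), namely \(\widehat{y}\pm t_{\alpha/2,\,n-2}\,S\sqrt{\tfrac{1}{n}+\tfrac{(x^*-\bar{x})^2}{S_{xx}}}\). Here the square-root factor times \(S\) is exactly the reported `se.fit` of 2.7807, and \(t_{.025,8}=2.306\), so

\[163.0963\pm 2.306(2.7807)=163.0963\pm 6.412\ \Longrightarrow\ (156.684,\ 169.509).\]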
Proof given in text. Continuing our example:
X <- c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75)
Y <- c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
reg1 <- data.frame(height = X, weight = Y)
fit <- lm(weight ~ height, data = reg1)
predict(fit,data.frame(height=70),se.fit=T,interval="prediction",level=.95)
## $fit
## fit lwr upr
## 1 163.0963 142.163 184.0296
##
## $se.fit
## [1] 2.780697
##
## $df
## [1] 8
##
## $residual.scale
## [1] 8.641368
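The prediction interval for a new observation at \(x^*=70\) uses \(\widehat{y}\pm t_{\alpha/2,\,n-2}\,S\sqrt{1+\tfrac{1}{n}+\tfrac{(x^*-\bar{x})^2}{S_{xx}}}\), which widens the standard error by an extra \(S^2\) term. Numerically,

\[163.0963\pm 2.306\sqrt{8.6414^2+2.7807^2}\approx 163.0963\pm 20.93\ \Longrightarrow\ (142.16,\ 184.03),\]

matching R's \((142.163,\ 184.0296)\). Note the prediction interval is much wider than the confidence interval for the mean response, since it must cover a single new observation rather than an average.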